Widespread Family of NAD+-Dependent Sulfoquinovosidases at the Gateway to Sulfoquinovose Catabolism

The sulfosugar sulfoquinovose (SQ) is produced by photosynthetic plants, algae, and cyanobacteria on a scale of 10 billion tons per annum. Its degradation, which is essential to allow cycling of its constituent carbon and sulfur, involves specialized glycosidases termed sulfoquinovosidases (SQases), which release SQ from sulfolipid glycoconjugates so that SQ can enter catabolic pathways. However, many SQ catabolic gene clusters lack a gene encoding a classical SQase. Here, we report the discovery of a new family of SQases that use an atypical oxidoreductive mechanism involving NAD+ as a catalytic cofactor. Three-dimensional X-ray structures of complexes with SQ and NAD+ provide insight into the catalytic mechanism, which involves transient oxidation at C3. A bioinformatic survey reveals that this new family of NAD+-dependent SQases occurs within sulfoglycolytic and sulfolytic gene clusters that lack classical SQases and is widely distributed, including within Roseobacter clade bacteria, suggesting an important contribution to marine sulfur cycling.

The estimated molecular weight from both spectra, shown in the table, is equivalent to a homodimer of each protein. Small amounts of higher-order oligomers are also observed in the ArSqgA sample, perhaps due to four free, exposed cysteine residues. For BglT, a Mn2+ binding site (shown in dark magenta) is observed, with metal interactions with glucose-6-phosphate C2-OH (2.1 Å) and C3-OH (2.5 Å), the nicotinamide of the NAD+ cofactor, and the conserved active site residues Cys162 and His192 (at 2.4 Å and 2.5 Å, respectively).
13C NMR analysis of culture media: Arthrobacter sp. AK01 was grown to visible turbidity in M9 media containing 5 mM SQGro at 28 °C for 4 days with continuous shaking at 250 rpm.
Cultures were centrifuged at 10000 rpm for 10 min (Sigma Laborzentrifugen model 1-15, rotor 12124) and the supernatant was diluted to 50% D2O. 13C NMR spectra of the diluted supernatant were recorded using a 500 MHz instrument.

Proteomics
Sample preparation for proteomic analysis: Frozen whole bacterial pellets were prepared using the in-StageTip preparation approach as previously described [3]. Cells were resuspended in 4% sodium deoxycholate (SDC), 100 mM Tris pH 8.0 and boiled at 95 °C with shaking for 10 min to aid solubilisation. Samples were allowed to cool for 10 min and then boiled for a further 10 min before the protein concentration was determined by bicinchoninic acid assays (Thermo Fisher Scientific). 50 μg of protein for each biological replicate was reduced/alkylated by the addition of tris-2-carboxyethyl phosphine hydrochloride and iodoacetamide (final concentrations 20 mM and 60 mM, respectively) and incubation in the dark for 1 hour at 45 °C. Following reduction/alkylation, samples were digested overnight with trypsin (1/25 w/w Solu-trypsin, Sigma) at 37 °C with shaking at 1000 rpm. [4][5] SDB-RPS StageTips were placed in a Spin96 tip holder [4] to enable batch-based spinning of samples, and the tips were conditioned with 100% acetonitrile, followed by 30% methanol, 1% trifluoroacetic acid, and then 90% isopropanol, 1% trifluoroacetic acid, with each wash spun through the column at 1000 x g for 3 min. Acidified isopropanol/peptide mixtures were loaded onto the SDB-RPS columns and spun through, and the tips were washed with 90% isopropanol, 1% trifluoroacetic acid followed by 1% trifluoroacetic acid in Milli-Q water.
Peptide samples were eluted with 80% acetonitrile, 5% ammonium hydroxide, dried by vacuum centrifugation at room temperature, and stored at -20 °C.
Reverse phase liquid chromatography-mass spectrometry: Prepared digested proteome samples were re-suspended in Buffer A* (2% acetonitrile, 0.01% trifluoroacetic acid) and separated using a two-column chromatography setup composed of a PepMap100 C18 20-mm by 75-µm trap and a PepMap C18 500-mm by 75-µm analytical column (Thermo Fisher Scientific). Samples were concentrated onto the trap column at 5 µl/min for 5 min with Buffer A (0.1% formic acid, 2% DMSO) and then infused into an Orbitrap Fusion Lumos equipped with a FAIMS Pro interface at 300 nl/min via the analytical column using a Dionex Ultimate 3000 UPLC (Thermo Fisher Scientific). 125-minute analytical runs were undertaken by altering the buffer composition from 2% Buffer B (0.1% formic acid, 77.9% acetonitrile, 2% DMSO) to 22% B over 95 min, then from 22% B to 40% B over 10 min, then from 40% B to 80% B over 5 min.
The composition was held at 80% B for 5 min, and then dropped to 2% B over 2 min before being held at 2% B for another 8 min. The Fusion Lumos mass spectrometer was operated in a stepped high-field asymmetric-waveform ion mobility spectrometry (FAIMS) data-dependent mode at two different FAIMS compensation voltages (CVs), -40 and -60. For each FAIMS CV, a single Orbitrap MS scan (300-1600 m/z at a resolution of 60k) was acquired every 1.7 seconds, followed by Orbitrap MS/MS HCD scans of precursors (stepped NCE 25, 35, 45%, with a maximal injection time of 54 ms, the automatic gain control set to 250% and the resolution to 30k).
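The gradient described above can be written down as a list of linear segments, which makes it easy to check that the steps add up to the stated 125-minute run. The following is a minimal sketch only: the segment values are transcribed from the text, while the names `SEGMENTS`, `total_runtime` and `percent_b` are hypothetical helpers introduced here for illustration and are not part of any instrument software.

```python
# Gradient segments as (duration in min, starting %B, final %B),
# transcribed from the method description above.
SEGMENTS = [
    (95, 2, 22),   # 2% B to 22% B over 95 min
    (10, 22, 40),  # 22% B to 40% B over 10 min
    (5, 40, 80),   # 40% B to 80% B over 5 min
    (5, 80, 80),   # hold at 80% B for 5 min
    (2, 80, 2),    # drop to 2% B over 2 min
    (8, 2, 2),     # hold at 2% B for 8 min
]


def total_runtime(segments):
    """Sum of all segment durations in minutes."""
    return sum(duration for duration, _, _ in segments)


def percent_b(t, segments=SEGMENTS):
    """Linearly interpolated %B at time t (minutes from the start of the run)."""
    if t < 0 or t > total_runtime(segments):
        raise ValueError("time lies outside the programmed run")
    elapsed = 0.0
    for duration, start, end in segments:
        if t <= elapsed + duration:
            return start + (end - start) * (t - elapsed) / duration
        elapsed += duration
    return float(segments[-1][2])


if __name__ == "__main__":
    print("total run time (min):", total_runtime(SEGMENTS))
    for t in (0, 47.5, 95, 105, 110, 116, 125):
        print(t, "min ->", percent_b(t), "% B")
```

Summing the segment durations gives 95 + 10 + 5 + 5 + 2 + 8 = 125 min, which agrees with the run length quoted in the text.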

Proteomic data analysis:
Identification and label-free quantification (LFQ) analysis were accomplished using MaxQuant (v1.6.17.0) [6] with the in-house generated proteome of AK01, allowing for oxidation on methionine. Prior to MaxQuant analysis, datasets acquired on the Fusion Lumos were separated into individual FAIMS fractions using the FAIMS MzXML Generator [7]. The LFQ and "Match Between Runs" options were enabled to allow comparison between samples. The resulting data files were processed using Perseus (v1.4.0.6) [8] to compare the growth conditions using Student's t-tests as well as Pearson correlation analyses.
For LFQ comparisons, biological replicates were grouped and missing values were imputed based on the observed total peptide intensities with a range of 0.3σ and a downshift of 1.8σ.
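The imputation step can be illustrated with a short sketch. It assumes that missing values are drawn from a normal distribution whose mean is shifted down by 1.8σ and whose width is 0.3σ, relative to the mean and standard deviation of all observed values, and that intensities are log2-transformed beforehand. These details, and the function name `impute_downshifted`, are assumptions made for illustration; this is not the Perseus implementation.

```python
import random
import statistics


def impute_downshifted(values, width=0.3, downshift=1.8, seed=0):
    """Replace None entries with draws from a narrowed, down-shifted normal.

    values    : flat list of (assumed log2) intensities, None where missing
    width     : imputed distribution width, as a fraction of the observed sigma
    downshift : shift of the imputed mean below the observed mean, in sigma units
    seed      : fixed seed so that the sketch is reproducible
    """
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    sigma = statistics.stdev(observed)
    rng = random.Random(seed)
    return [
        v if v is not None else rng.gauss(mu - downshift * sigma, width * sigma)
        for v in values
    ]


if __name__ == "__main__":
    # Hypothetical log2 LFQ intensities with two missing measurements.
    data = [24.1, 25.3, None, 26.0, 23.8, None, 25.1]
    for before, after in zip(data, impute_downshifted(data, seed=1)):
        print(before, "->", round(after, 2))
```

Drawing imputed values from the low end of the observed distribution reflects the usual assumption that proteins are missing because they fall below the detection limit, rather than at random.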

Data availability:
The mass spectrometry proteomics data have been deposited in the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifier PXD043482 (Username: reviewer_pxd043482@ebi.ac.uk; Password: peKhCqz5).

for 16 hours at 37 °C. A single colony was picked and used to inoculate 10 mL of LB containing 50 µg mL-1 kanamycin and 30 µg mL-1 chloramphenicol. This pre-culture was incubated at 37 °C for 16 hours with shaking at 250 rpm, and then used to inoculate 1 L of LB containing 50 µg mL-1 kanamycin and 30 µg mL-1 chloramphenicol. The culture was incubated at 37 °C, with shaking at 250 rpm, until an OD600 of 0.6 was reached. The culture was cooled to 18 °C, then 0.5 mM isopropyl thiogalactoside was added and the culture was incubated at 18 °C for a further 16 hours. For ArSqgA, E. coli BL21(DE3) cells were used without chloramphenicol. The cells were harvested by centrifugation at 5000 x g for 20 min at 4 °C and the supernatant was discarded.

Amino acid sequences of targeted enzymes
The pellet was re-suspended in 30 mL buffer A (50 mM Tris-HCl pH 7.5, 300 mM NaCl, 30 mM imidazole, 1 mM DTT) containing EDTA-free protease inhibitor cocktail (Roche cOmplete), 40 µg mL-1 lysozyme and 250 U benzonase nuclease (Sigma Aldrich). The cells were lysed by passage through a French Press (twice), operated at 25 kPsi, and soluble protein was isolated by centrifugation at 20000 x g for 40 minutes at 4 °C. The supernatant was loaded onto a pre-equilibrated HisTrap FF Crude 5 mL column (Cytiva) using an Äkta pure chromatography system (Cytiva). The column was washed with 5 column volumes (CV) of buffer A, then eluted with a gradient of buffer B (50 mM Tris-HCl pH 7.

Figure S4. Optimization of reaction conditions for FlSqgA and exploration of specificity. a) HPLC mass spectrometry (triple quadrupole, QqQ) chromatograms showing that the enzyme is active in the presence of NAD+, DTT and, optionally, Mn2+. b) Enzyme activity assessed against various PNP glycosides, in the presence or absence of Mn2+, and when treated with EDTA. c) pH profile for reaction of 10 mM -PNPSQ with 1.86 µM of FlSqgA. d) Enzyme activity in the presence of NAD+, NADP+, or NADH.

Figure S6. Kinetic analysis of ArSqgA and CrSqgA. a) and b) Michaelis-Menten and Lineweaver-Burk plots for reaction rates measured for ArSqgA using -PNPSQ as a substrate. c) and d) Michaelis-Menten and Lineweaver-Burk plots for reaction rates measured for CrSqgA using -PNPSQ as a substrate.

Figure S8. 3D crystal structure of the ArSqgA•NAD+ complex. The dimer pair is shown in beige and dark cyan. The interactions with the active site residues within 4 Å of NAD+ are shown on the left. Electron density in blue mesh corresponds to the σA-weighted 2Fo − Fc map contoured at 1σ.

Figure S10. 3D crystal structure of FlSqgA•NAD+•SQ. The structure shows the physiological dimer pair and the interactions of the ligands, NAD+ and SQ, with the active site residues within 4 Å of SQ. Electron density in blue mesh corresponds to the σA-weighted 2Fo − Fc map contoured at 1σ.

Figure S12. Superposition of structures of ArSqgA•NAD+•SQ (in dark cyan) with the Mn2+/NAD+-dependent family GH4 BglT•NAD+•G6P (PDB: 1UP6, in grey). The overlay reveals a similar N-terminal domain and location of cofactor and substrate binding sites, and highlights differences in the C-terminal domain and active site interactions between the two families. For clarity, only a BglT monomer is superposed on the ArSqgA dimer (RMS deviation of 3.51 Å over 156 residues).

Figure S14. Hidden Markov model strategy for classification of the new GH188 family.

Figure S15. Plot of closeness centrality values against different alignment scores of the sequence similarity network.

Figure S17. Phylogenetic tree of the new GH188 family from Roseobacter clade bacteria. The phylogenetic tree of the Roseobacteraceae family was constructed using MEGA (Molecular Evolutionary Genetics Analysis) software and formatted using iTOL (Interactive Tree of Life, https://itol.embl.de/). The size of the blue circle at each branch indicates the bootstrap proportion for 100 replicates.

4 .
Preparation and analysis of SqgA enzymes
Cloning and expression: The vectors encoding FlSqgA (UniProt: A0A7D7VZ79) and ArSqgA (UniProt: A0A1C9WRL0) were transformed into E. coli Rosetta(DE3) pLysS or BL21(DE3) cells (Novagen) and grown on Lysogeny Broth (LB) agar plates containing 50 µg mL-1 kanamycin

FlSqgA with NAD+: Reaction rates were measured for FlSqgA with a constant concentration of -PNPSQ and varied concentrations of NAD+. Release of chromogenic 4-nitrophenolate was monitored using a UV/visible spectrophotometer at 405 nm, with an extinction coefficient of 7830 M-1 cm-1 under the assay conditions. Reactions were conducted in 50 mM Tris.HCl (pH 7), 20 mM NaCl, 4 mM -PNPSQ, 2 mM DTT, 0.1 mM MnCl2 at 25 °C using 1.86 μM of FlSqgA at NAD+ concentrations ranging from 0.02 to 1.5 mM. The Michaelis-Menten activation constant (KA) was calculated using Prism (GraphPad Scientific Software).

Michaelis-Menten kinetics for FlSqgA using SQGro:
a) HPLC conditions for monitoring product formation: Formation of SQ upon cleavage of SQGro was measured using HPLC-ESI-MS/MS analysis. This used a triple quadrupole mass spectrometer (Agilent 6460 QQQ) coupled with an Agilent 1260 Infinity Series LC system. The column was an XBridge Premier BEH Amide VanGuard FIT column, 2.5 µm, 4.6 mm x 150 mm. HPLC conditions were from 90% B to 20% B over 10 min; then 20% B for 5 min; back to 90% B in 2 min; 90% B for 10 min (Solvent A: 10 mM ammonium acetate in 1% acetonitrile; Solvent B: 100% acetonitrile); flow rate, 0.50 mL/min; injection volume, 5 µL. The mass spectrometer was operated in negative ionization mode. Quantification was achieved using the MS/MS multiple reaction monitoring (MRM) mode of Agilent MassHunter Quantitative Analysis software and normalized using α-PNPGlcA as internal standard. The sensitivity for each MRM-MS/MS transition was optimized for each analyte before analysis.
b) Measurement of reaction rates: A calibration curve for response to SQ was constructed using varying concentrations of SQ and a constant concentration of the internal standard α-PNPGlcA. To demonstrate linearity of reaction rates, enzyme reactions for FlSqgA were conducted in 100 μL volumes containing 50 mM Tris.HCl (pH 7), 20 mM NaCl, 1 mM NAD+, 2 mM DTT, 0.1 mM MnCl2, 1 mM SQGro and 0.93 μM of FlSqgA. Reactions were initiated by addition of enzyme and incubated for 1 h at 30 °C. At 20, 40 and 60 min, 20 µL of reaction mixture was removed and quenched by heating at 80 °C for 4 min. The quenched reaction mixtures were mixed with 30 µL of internal standard (0.05 mM α-PNPGlcA) and analyzed by MS-MS.
c) Enzyme kinetics: Enzyme assays were conducted in 100 μL volumes containing 50 mM Tris.HCl buffer (pH 7), 20 mM NaCl, 1 mM NAD+, 2 mM DTT, 0.1 mM MnCl2, 0.93 μM of FlSqgA and substrate concentrations ranging from 0.5 to 20 mM. Reactions were initiated by addition of enzyme and incubated for 30 min at 30 °C. After 20 min, the reaction was quenched by heating at 80 °C for 4 min. The quenched reaction mixtures were mixed with 30 µL of internal standard (i.e., 0.05 mM α-PNPGlcA) and analyzed by MS-MS.

Substrate specificity of FlSqgA versus and -PNPGlc and -PNPGlcA: Specificity of FlSqgA was tested against and -PNPGlc and -PNPGlcA using 100 µL reaction mixtures containing 50 mM Tris.HCl buffer (pH 7), 20 mM NaCl, 1 mM NAD+, 2 mM DTT, 0.1 mM MnCl2 and 10 mM or -PNPGlc or -PNPGlcA at 25 °C using 1.86 µM of FlSqgA. Reactions were monitored continuously for 1 h by recording the absorbance at λ = 405 nm for the release of the chromogenic 4-nitrophenolate ion as a product (the extinction coefficient was 7830 M-1 cm-1 under the assay conditions). Absorbance was measured using a multimodal plate reader (FLUOstar Omega, BMG Labtech).

Deuterium labelling experiments: Buffers and reagents were prepared in 99.9% D2O. FlSqgA was exchanged into deuterated buffer solutions by dialysis using Slide-A-Lyzer mini dialysis devices with a nominal molecular weight limit (NMWL) of 10 kDa. Reactions were conducted at 25 °C in 50 mM Tris.HCl (pH 7), 20 mM NaCl, 0.1 mM MnCl2, 25 mM -PNPSQ and 2.24 μM FlSqgA in a total volume of 1000 mL. The reaction mixture was incubated at 25 °C overnight and product formation was measured using HPLC-ESI-MS/MS analysis using a triple quadrupole mass spectrometer (Agilent 6460 QQQ) coupled with an Agilent 1260 Infinity Series LC system. The column was an XBridge Premier BEH Amide VanGuard FIT column, 2.5
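The kinetic measurements described in this section rest on two calculations: converting an absorbance change at 405 nm into a concentration of released 4-nitrophenolate using the stated extinction coefficient, and extracting Michaelis-Menten parameters from rates measured at several substrate concentrations. The sketch below illustrates both under stated assumptions. The 1 cm path length is an assumption (a plate reader would need its own path length), the rate data are synthetic, and the fit uses a double-reciprocal (Lineweaver-Burk) linear regression for simplicity rather than the Prism fit used in the text. All function names are hypothetical.

```python
EPSILON_405 = 7830.0  # M^-1 cm^-1, extinction coefficient quoted in the text


def rate_from_absorbance(slope_per_min, path_cm=1.0, epsilon=EPSILON_405):
    """Convert dA405/dt (absorbance units per min) into µM product per min.

    Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l).
    path_cm defaults to 1 cm, which is an assumption of this sketch.
    """
    return slope_per_min / (epsilon * path_cm) * 1e6


def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, rates):
    """Least-squares line through (1/[S], 1/v); returns (vmax, km).

    1/v = (km/vmax) * (1/[S]) + 1/vmax, so intercept = 1/vmax
    and slope = km/vmax.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


if __name__ == "__main__":
    # Synthetic, noise-free rates over the 0.5 to 20 mM range used in the text.
    conc_mM = [0.5, 1, 2, 5, 10, 20]
    rates = [michaelis_menten(s, vmax=10.0, km=2.0) for s in conc_mM]
    print(lineweaver_burk_fit(conc_mM, rates))
    print(rate_from_absorbance(0.0783), "µM/min")
```

The double-reciprocal transform gives undue weight to the low-substrate points when data are noisy, so a direct nonlinear fit is generally preferable for real measurements; the linear form is used here only because it needs no external libraries.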

Data collection and refinement statistics. Numbers in brackets refer to data for the highest resolution shells.

Table S2. Data collection and refinement statistics. Numbers in brackets refer to data for the highest resolution shells.

Table S3. PFAM codes for neighborhood genes from different sulfoglycolytic and sulfolytic pathways.